Agglomerative Mean-Shift Clustering via Query Set Compression

نویسندگان

  • Xiao-Tong Yuan
  • Bao-Gang Hu
  • Ran He
چکیده

Mean-Shift (MS) is a powerful non-parametric clustering method. Although good accuracy can be achieved, its computational cost is particularly expensive even on moderate data sets. In this paper, for the purpose of algorithm speedup, we develop an agglomerative MS clustering method called Agglo-MS, along with its mode-seeking ability and convergence property analysis. Our method is built upon an iterative query set compression mechanism which is motivated by the quadratic bounding optimization nature of MS. The whole framework can be efficiently implemented in linear running time complexity. Furthermore, we show that the pairwise constraint information can be naturally integrated into our framework to derive a semi-supervised non-parametric clustering method. Extensive experiments on toy and real-world data sets validate the speedup advantage and numerical accuracy of our method, as well as the superiority of its semi-supervised version.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Comparison of Agglomerative and Partitional Document Clustering Algorithms

Fast and high-quality document clustering algorithms play an important role in providing intuitive navigation and browsing mechanisms by organizing large amounts of information into a small number of meaningful clusters, and in greatly improving the retrieval performance either via cluster-driven dimensionality reduction, term-weighting, or query expansion. This ever-increasing importance of do...

متن کامل

Agglomerative Mean Shift Cluster Using Shortest Path and Fuzzification Algorithm

In this research paper, an agglomerative mean shift with fuzzy clustering algorithm for numerical data and image data, an extension to the standard fuzzy C-Means algorithm by introducing a penalty term to the objective function to make the clustering process not sensitive to the initial cluster centers. The new algorithm of Shortest path and Fuzzification algorithm can produce more consistent c...

متن کامل

Implementation of Hybrid Clustering Algorithm with Enhanced K-Means and Hierarchal Clustering

We are propose a hybrid clustering method, the methodology combines the strengths of both partitioning and agglomerative clustering methods. Clustering algorithms that build meaningful hierarchies out of large document collections are ideal tools for their interactive visualization and exploration as they provide data-views that are consistent, predictable, and at different levels of granularit...

متن کامل

Anisotropic Agglomerative Adaptive Mean-Shift

Mean Shift today, is widely used for mode detection and clustering. The technique though, is challenged in practice due to assumptions of isotropicity and homoscedasticity. We present an adaptive Mean Shift methodology that allows for full anisotropic clustering, through unsupervised local bandwidth selection. The bandwidth matrices evolve naturally, adapting locally through agglomeration, and ...

متن کامل

Improving Search Engine For A Digital Library

This project introduces a novel approach for using click through data to discover clusters of similar queries and similar URLs. One can simply observe that users reformulate their queries to find a desirable result. We define this sequence of queries as a query chain. Our data set consists of records containing user id, query term, query date and time, clicked item URL and clicked item rank if ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2009